Chapter 3 Uncertainty Propagation
Authors
Ramón Fernandez Astudillo, Electronics and Medical Signal Processing Group, TU Berlin, 10587 Berlin, e-mail: ramon@astudillo.com
Dorothea Kolossa, Electronics and Medical Signal Processing Group, TU Berlin, 10587 Berlin
Abstract
While it is often fairly straightforward to estimate the reliability of speech features in the time-frequency domain, this may not be true in other domains more amenable to speech recognition, such as RASTA-PLP features or those obtained with the ETSI advanced front end. In such cases, one useful approach is to estimate the uncertainties in the domain where noise-reduction preprocessing is carried out, and subsequently to transform the uncertainties, along with the actual features, to the recognition domain. To develop suitable approaches, we first give a short overview of relevant strategies for propagating probability distributions through nonlinearities. Second, for some feature domains suitable for robust recognition, we show possible implementations and sensible approximations of uncertainty propagation and discuss the associated error margins and trade-offs.
3.1 Uncertainty Propagation
In automatic speech recognition (ASR), an incoming speech signal is transformed into a set of observation vectors $x = x_1 \cdots x_L$, which are input to the recognition model. ASR systems that work with uncertain input data replace this observation set by a distribution of possible observed values conditioned on the available information $I$, $p(x_1 \cdots x_L \mid I)$. This uncertainty represents the lost information that causes the mismatch between the observed speech and the trained ASR model. By combining this distribution with observation uncertainty techniques [23], superior recognition robustness can be attained. There are multiple sources from which this ...
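The idea of propagating a probability distribution through a nonlinearity, as outlined in the abstract, can be illustrated with a minimal sketch. It is not the chapter's method: it assumes a scalar Gaussian input and uses the logarithm as a stand-in nonlinearity, and the function names `propagate_monte_carlo` and `propagate_first_order` are hypothetical. It compares a sampling-based estimate of the output mean and variance with a first-order (linearized) approximation.

```python
import math
import random
import statistics


def propagate_monte_carlo(mean, var, f, n_samples=20000, seed=0):
    """Estimate mean and variance of f(x) for x ~ N(mean, var) by sampling."""
    rng = random.Random(seed)
    std = math.sqrt(var)
    ys = [f(rng.gauss(mean, std)) for _ in range(n_samples)]
    return statistics.fmean(ys), statistics.pvariance(ys)


def propagate_first_order(mean, var, f, eps=1e-5):
    """Linearize f around the input mean: output mean f(mean),
    output variance f'(mean)^2 * var (derivative by central difference)."""
    slope = (f(mean + eps) - f(mean - eps)) / (2.0 * eps)
    return f(mean), slope * slope * var


if __name__ == "__main__":
    # Illustrative values only (assumption): input mean 10, input variance 1.
    mc_mean, mc_var = propagate_monte_carlo(10.0, 1.0, math.log)
    lin_mean, lin_var = propagate_first_order(10.0, 1.0, math.log)
    print("Monte Carlo :", mc_mean, mc_var)
    print("First order :", lin_mean, lin_var)
```

For this input the linearized variance is about 0.01, since the derivative of the logarithm at 10 is 0.1. Sampling is more expensive but captures the effect of the curvature, which the first-order approximation ignores; the gap between the two grows as the input variance increases relative to the mean.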
Similar resources
UNDERSTANDING AND EXPLOITING THE ACOUSTIC PROPAGATION DELAY IN UNDERWATER SENSOR NETWORKS by Affan
Chapter 1: Introduction; 1.1 Overview; 1.2 Motivation; 1.2.1 Communication Medium for UWSN; 1.2.2 The Vision of UWSN; 1.2.3 Acoustic Propagation and Sensornet Protocols ...
Propagation of Interval and Probabilistic Uncertainty in Cyberinfrastructure-Related Data Processing
Data uncertainty affects the results of data processing. So, it is necessary to find out how the data uncertainty propagates into the uncertainty of the results of data processing. This problem is especially important when cyberinfrastructure enables us to process large amounts of heterogeneous data. In the ideal world, we should have an accurate description of data uncertainty, and well-justif...
Propagation of Interval and Probabilistic Uncertainty in Cyberinfrastructure-Related Data Processing and Data Fusion
Data uncertainty affects the results of data processing. So, it is necessary to find out how the data uncertainty propagates into the uncertainty of the results of data processing. This problem is especially important when cyberinfrastructure enables us to process large amounts of heterogeneous data. In the ideal world, we should have an accurate description of data uncertainty, and well-justif...
Uncertainty driven Compensation of Multi-Stream MLP Acoustic Models for Robust ASR
In this paper we show how the robustness of multi-stream multi-layer perceptron (MLP) acoustic models can be increased through uncertainty propagation and decoding. We demonstrate that MLP uncertainty decoding yields consistent improvements over using minimum mean square error (MMSE) feature enhancement in MFCC and RASTA-LPCC domains. We also introduce formulas for the computation of the unc...
A MMSE estimator in mel-cepstral domain for robust large vocabulary automatic speech recognition using uncertainty propagation
Uncertainty propagation techniques achieve more robust automatic speech recognition by modeling, in probabilistic form, the information missing after speech enhancement in the short-time Fourier transform (STFT) domain. This information is then propagated into the feature domain where recognition takes place and combined with observation uncertainty techniques like uncertainty decoding. In this...